LNCS 3180



Fernando Galindo
Makoto Takizawa
Roland Traunmüller (Eds.)



Database and Expert 
Systems Applications 

15th International Conference, DEXA 2004 
Zaragoza, Spain, August/September 2004 
Proceedings 



DEXA 2004 



Springer




Lecture Notes in Computer Science 

Commenced Publication in 1973
Founding and Former Series Editors: 

Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen 



Editorial Board 

David Hutchison 

Lancaster University, UK 
Takeo Kanade 

Carnegie Mellon University, Pittsburgh, PA, USA 
Josef Kittler 

University of Surrey, Guildford, UK 
Jon M. Kleinberg 

Cornell University, Ithaca, NY, USA 
Friedemann Mattern 

ETH Zurich, Switzerland 
John C. Mitchell 

Stanford University, CA, USA 
Moni Naor 

Weizmann Institute of Science, Rehovot, Israel 
Oscar Nierstrasz 

University of Bern, Switzerland 
C. Pandu Rangan 

Indian Institute of Technology, Madras, India 
Bernhard Steffen 

University of Dortmund, Germany 
Madhu Sudan 

Massachusetts Institute of Technology, MA, USA 
Demetri Terzopoulos 

New York University, NY, USA 
Doug Tygar 

University of California, Berkeley, CA, USA 
Moshe Y. Vardi 

Rice University, Houston, TX, USA
Gerhard Weikum 

Max-Planck Institute of Computer Science, Saarbruecken, Germany 



3180 




Fernando Galindo Makoto Takizawa 
Roland Traunmüller (Eds.)



Database and Expert 
Systems Applications 



15th International Conference, DEXA 2004 
Zaragoza, Spain, August 30 - September 3, 2004 
Proceedings 




Springer 




Volume Editors 



Fernando Galindo 
University of Zaragoza 

Ciudad Universitaria, Plaza San Francisco, 50009 Zaragoza, Spain 
E-mail: cfa@unizar.es 

Makoto Takizawa 
Tokyo Denki University 

Ishizaka, Hatoyama-machi, Hiki-gun, 350-0394 Saitama, Japan 
E-mail: taki@takilab.k.dendai.ac.jp

Roland Traunmüller
University of Linz, Institute of Informatics in Business and Government

Altenbergerstr. 69, 4040 Linz, Austria
E-mail: traunm@ifs.uni-linz.ac.at



Library of Congress Control Number: 2004110971



CR Subject Classification (1998): H.2, H.4, H.3, H.5, I.2, J.1
ISSN 0302-9743 

ISBN 3-540-22936-1 Springer Berlin Heidelberg New York 



This work is subject to copyright. All rights are reserved, whether the whole or part of the material is 
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, 
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication 
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, 
in its current version, and permission for use must always be obtained from Springer. Violations are liable 
to prosecution under the German Copyright Law. 

Springer is a part of Springer Science+Business Media 

springeronline.com 

(c) Springer-Verlag Berlin Heidelberg 2004 
Printed in Germany 

Typesetting: Camera-ready by author, data conversion by Christian Grosche, Hamburg 
Printed on acid-free paper SPIN: 11310143 06/3142 5 4 3 2 1 0 




Preface 



DEXA 2004, the 15th International Conference on Database and Expert Systems
Applications, was held August 30 - September 3, 2004, at the University of
Zaragoza, Spain. The quickly growing spectrum of database applications has led to
the establishment of more specialized discussion platforms (the DaWaK Conference,
EC-Web Conference, EGOV Conference, TrustBus Conference and DEXA
Workshop; every DEXA event has its own conference proceedings), which were
held in parallel with the DEXA Conference, also in Zaragoza.

In your hands are the results of much effort. The work begins with the preparation
of the submitted papers, which then go through the reviewing process. The
accepted papers are revised to final versions by their authors and are then arranged
within the conference program. Everything culminates in the conference itself. For
this conference 304 papers were submitted, and we want to thank all who contributed
to it; they are the real basis of the conference. The program committee and the
supporting reviewers produced altogether 942 referee reports, on average 3.1
reports per paper, and selected 92 papers for presentation.

At this point we would like to say many thanks to all the institutions that actively 
supported this conference and made it possible. These were: 

• University of Zaragoza 

• FAW 

• DEXA Association 

• Austrian Computer Society 

A conference like DEXA would not be possible without the enthusiastic
commitment of several people in the background. First we want to thank the
whole program committee for the thorough review process. Many thanks also to
Maria Schweikert (Technical University, Vienna), Andreas Dreiling (FAW,
University of Linz) and Monika Neubauer (FAW, University of Linz). Special
thanks go to Gabriela Wagner. She is the Scientific Event Manager in charge of the
DEXA organization and has organized the whole DEXA event. The editors express
their high appreciation of her outstanding dedication. The scientific community
appreciates the way she helps authors and participants whenever necessary.



June 2004 



Fernando Galindo 
Makoto Takizawa 
Roland Traunmüller




Program Committee 



General Chairperson: 

Fernando Galindo, University of Zaragoza, Spain 

Conference Program Chairpersons: 

Makoto Takizawa, Tokyo Denki University, Japan 
Roland Traunmüller, University of Linz, Austria

Workshop Chairpersons: 

A Min Tjoa, Technical University of Vienna, Austria 
Roland R. Wagner, FAW, University of Linz, Austria 

Program Committee Members: 

Witold Abramowicz, The Poznan University of Economics, Poland 

Michel Adiba, IMAG - Laboratoire LSR, France 

Hamideh Afsarmanesh, University of Amsterdam, The Netherlands

Ala Al-Zobaidie, University of Greenwich, UK 

Walid G. Aref, Purdue University, USA 

Ramazan S. Aygun, University of Alabama in Huntsville, USA

Kurt Bauknecht, University of Zurich, Switzerland 

Trevor Bench-Capon, University of Liverpool, UK 

Elisa Bertino, University of Milan, Italy 

Alfs Berztiss, University of Pittsburgh, USA 

Bishwaranjan Bhattacharjee, IBM T.J. Watson Research Center, USA 

Sourav S Bhowmick, Nanyang Technological University, Singapore 

Christian Bohm, University of Munich, Germany 

Alex Borgida, Rutgers University, USA 

Omran Bukhres, Purdue University School of Science, USA 

Luis Camarinha-Matos, New University of Lisbon, Portugal

Antonio Cammelli, CNR, Italy 

Malu Castellanos, Hewlett-Packard Laboratories, USA 

Tiziana Catarci, University of Rome “La Sapienza”, Italy 

Wojciech Cellary, University of Economics at Poznan, Poland 

Elizabeth Chang, Curtin University, Australia 

Sudarshan S. Chawathe, University of Maryland, USA 

Ming-Syan Chen, National Taiwan University, Taiwan 

Paolo Ciaccia, University of Bologna, Italy 

Rosine Cicchetti, IUT, University of Marseille, France 

Carlo Combi, University of Verona, Italy 

Brian Frank Cooper, Georgia Institute of Technology, USA 

Isabel Cruz, University of Illinois at Chicago, USA 

John Debenham, University of Technology, Sydney, Australia

Misbah Deen, University of Keele, UK 





Stefan Dessloch, University of Kaiserslautern, Germany

Elisabetta Di Nitto, Politecnico di Milano, Italy 

Nina Edelweiss, Universidade Federal do Rio Grande do Sul, Brazil 

Johann Eder, University of Klagenfurt, Austria 

Gregor Engels, University of Paderborn, Germany 

Peter Fankhauser, Fraunhofer IPSI, Germany 

Ling Feng, University of Twente, The Netherlands 

Eduardo Fernandez, Florida Atlantic University, USA 

Simon Field, Matching Systems Ltd., Switzerland 

Burkhard Freitag, University of Passau, Germany 

Mariagrazia Fugini, Politecnico di Milano, Italy 

Irini Fundulaki, Bell Laboratories, Lucent Technologies, USA 

Antonio L. Furtado, Pontificia Universidade Catolica do R.J., Brazil 

Manolo Garcia-Solaco, IS Consultant, USA 

Georges Gardarin, University of Versailles, France 

Alexander Gelbukh, CIC, Instituto Politecnico Nacional (IPN), Mexico 

Parke Godfrey, The College of William and Mary, Canada 

Paul Grefen, Eindhoven University of Technology, The Netherlands 

William Grosky, University of Michigan, USA 

Le Gruenwald, University of Oklahoma, USA 

Abdelkader Hameurlain, University of Toulouse, France 

Igor T. Hawryszkiewycz, University of Technology, Sydney, Australia 

Wynne Hsu, National University of Singapore, Singapore 

Mohamed Ibrahim, University of Greenwich, UK 

H.-Arno Jacobsen, University of Toronto, Canada 

Yahiko Kambayashi, Kyoto University, Japan 

Gerti Kappel, Vienna University of Technology, Austria 

Dimitris Karagiannis, University of Vienna, Austria 

Randi Karlsen, University of Tromso, Norway 

Rudolf Keller, Zühlke Engineering AG, Switzerland

Latifur Khan, University of Texas at Dallas, USA 

Myoung Ho Kim, KAIST, Korea 

Masaru Kitsuregawa, Tokyo University, Japan

Gary J. Koehler, University of Florida, USA 

Nick Koudas, AT&T Labs Research, USA 

John Krogstie, SINTEF, Norway 

Petr Kroha, Technical University Chemnitz-Zwickau, Germany

Josef Küng, FAW, University of Linz, Austria

Lotfi Lakhal, University of Marseille, France 

Christian Lang, IBM T.J. Watson Research Center, USA 

Jiri Lazansky, Czech Technical University, Czech Republic 

Young-Koo Lee, University of Illinois, USA 

Mong Li Lee, National University of Singapore, Singapore 

Michel Leonard, University of Geneva, Switzerland 

Tok Wang Ling, National University of Singapore, Singapore 

Volker Linnemann, University of Luebeck, Germany 

Mengchi Liu, Carleton University, Canada 







Peri Loucopoulos, UMIST, UK 

Sanjai Kumar Madria, University of Missouri-Rolla, USA 

Akifumi Makinouchi, Kyushu University, Japan 

Vladimir Marik, Czech Technical University, Czech Republic 

Simone Marinai, University of Florence, Italy 

Heinrich C. Mayr, University of Klagenfurt, Austria 

Subhasish Mazumdar, New Mexico Tech, USA 

Dennis McLeod, University of Southern California, USA 

Elisabeth Metais, CNAM, France 

Mukesh Mohania, IBM-IRL, India

Reagan Moore, San Diego Supercomputer Center, USA 

Tadeusz Morzy, Poznan University of Technology, Poland 

Noureddine Mouaddib, University of Nantes, France 

Gunter Muller, University of Freiburg, Germany 

Felix Naumann, Humboldt-Universitat zu Berlin, Germany 

Erich J. Neuhold, GMD-IPSI, Germany 

Wilfried Ng, University of Science and Technology, Hong Kong 

Matthias Nicola, IBM Silicon Valley Lab, USA 

Shojiro Nishio, Osaka University, Japan 

Gultekin Ozsoyoglu, Case Western Reserve University, USA

Georgios Pangalos, University of Thessaloniki, Greece 

Dimitris Papadias, University of Science and Technology, Hong Kong 

Stott Parker, University of Los Angeles (UCLA), USA 

Oscar Pastor, Universidad Politecnica de Valencia, Spain 

Jignesh M. Patel, University of Michigan, USA 

Glenn Paulley, iAnywhere Solutions (A Sybase Company), Canada 

Veronika Peralta, Universidad de la Republica, Uruguay 

Gunter Pernul, University of Regensburg, Germany 

Evaggelia Pitoura, University of Ioannina, Greece 

Alexandra Poulovassilis, University of London, UK 

Calton Pu, Georgia Institute of Technology, USA 

Gerald Quirchmayr, University of Vienna, Austria, and University of South Australia, Australia

Fausto Rabitti, CNUCE-CNR, Italy 

Wenny Rahayu, La Trobe University, Australia 

Isidro Ramos, Technical University of Valencia, Spain 

P. Krishna Reddy, International Institute of Information Technology, India 

Werner Retschitzegger, University of Linz, Austria 

Norman Revell, Middlesex University, UK 

Sally Rice, University of South Australia, Australia 

Philippe Rigaux, University of Paris Sud, France 

John Roddick, Flinders University of South Australia, Australia 

Colette Rolland, University Paris I, Sorbonne, France 

Armin Roth, DaimlerChrysler AG, Germany 

Elke Rundensteiner, Worcester Polytechnic Institute, USA 

Domenico Sacca, University of Calabria, Italy 

Arnaud Sahuguet, Bell Laboratories, Lucent Technologies, USA 







Simonas Saltenis, Aalborg University, Denmark 
Marinette Savonnet, Universite de Bourgogne, France 
Erich Schweighofer, University of Vienna, Austria 
Ming-Chien Shan, Hewlett-Packard Laboratories, USA 
Keng Siau, University of Nebraska-Lincoln, USA 

Michael H. Smith, University of Calgary, Canada, and University of California, USA 

Giovanni Soda, University of Florence, Italy 

Uma Srinivasan, CSIRO, Australia 

Bala Srinivasan, Monash University, Australia 

Olga Stepankova, Czech Technical University, Czech Republic 

Zbigniew Struzik, The University of Tokyo, Japan 

Katsumi Tanaka, Kyoto University, Japan 

Zahir Tari, University of Melbourne, Australia 

Stephanie Teufel, University of Fribourg, Switzerland 

Jukka Teuhola, University of Turku, Finland 

Bernd Thalheim, University of Kiel, Germany 

J.M. Thevenin, University of Toulouse, France 

Helmut Thoma, IBM Global Services Basel, Switzerland 

A Min Tjoa, Technical University of Vienna, Austria 

Aphrodite Tsalgatidou, University of Athens, Greece 

Susan Urban, Arizona State University, USA 

Genoveva Vargas-Solar, LSR-IMAG, France 

Krishnamurthy Vidyasankar, Memorial University of Newfoundland, Canada 

Pavel Vogel, Technical University Munich, Germany 

Roland Wagner, FAW, University of Linz, Austria 

Kyu-Young Whang, KAIST, Korea

Michael Wing, Middlesex University, UK 

Vilas Wuwongse, Asian Institute of Technology, Thailand 

Gian Piero Zarri, CNRS, France 

Arkady Zaslavsky, Monash University, Australia 




External Reviewers 



Miguel R. Penabad 
Manuel Montes-y-Gomez 
Angeles Saavedra-Places 
Hiram Calvo-Castro 
Ma Luisa Carpente 
Nieves R. Brisaboa 
Fabrizio Angiulli 
Eugenio Cesario 
Massimo Cossentino 
Alfredo Cuzzocrea 
Sergio Flesca 
Elio Masciari 
Massimiliano Mazzeo 
Luigi Pontieri 
Andrea Tagarelli 
Ioana Stanoi 
George Mihaila 
Min Wang 
Qiankun Zhao 
Ling Chen 
Sandeep Prakash 
Ersin Kaletas 
Ozgul Unal 

Ammar Benabdelkader 
Victor Guevara Masis 
Jens Bleiholder 
Melanie Weis 
Lars Rosenhainer 
Sarita Bassil 
Adriana Marotta 
Regina Motz 
Xiaohui Xue 
Dimitre Kostadinov 
Yufei Tao 
Nikos Mamoulis 
Xiang Lian 
Kyriakos Mouratidis 
Linas Bukauskas 
Alminas Civilis 
Vicente Pelechano 
Juan Sanchez 
Joan Fons 
Manoli Albert 



Silvia Abrahao 
Abdelhamid Bouchachia 
Christian Koncilia 
Marek Lehmann 
Horst Pichler 
Domenico Lembo 
Enrico Bertini 
Stephen Kimani 
Monica Scannapieco 
Diego Milano 
Claudio Gennaro 
Giuseppe Amato 
Pasquale Savino 
Carlo Meghini 
Albrecht Schmidt 
Grzegorz Bartosiewicz 
Krzysztof Wecel 
Roberto Tedesco 
Matthias Beck 
Gerhard Bloch 
Claus Dziarstek 
Tobias Geis 
Michael Guppenberger 
Thomas Nitsche 
Petra Schwaiger 
Wolfgang Volkl

Abheek Anand 
Akhil Gupta 
Feng Peng 

Kapila Ponnamperums 
Lothar Rostek 
Holger Brocks 
Andreas Wombacher
Bedik Mahleko 
Didier Nakache 
Nadira Lammari 
Tatiana Aubonnet 
Alain Couchot 
Cyril Labbe 
Claudia Roncancio 
Barbara Oliboni 
Bjorn Muschall 
Torsten Priebe 







Christian Schlager 

Andre Costi Nacul 

Carina Friedrich Dorneles 

Fabio Zschornack 

Mirella Moura Moro 

Renata de Matos Galante 

Vanessa de Paula Braganholo 

Andreas Herzig

Franck Morvan 

Yang Xiao 

Alex Cheung 

Vinod Muthusamy 

Daisy Zhe Wang 

Xu Zhengdao 

Thomais Pilioura 

Eleni Koutrouli 

George Athanasopoulos 

Anya Sotiropoulou 

Tsutomu Terada 

Jochen Kuester 

Tim Schattkowsky 

Marc Lohmann 

Hendrik Voigt

Arne Ketil Eidsvik 

Geir Egil Myhre 

Mark Cameron 

Surya Nepal 

Laurent Lefort 

Chaoyi Pang 

K.S. Siddesh 

Hanyu Li

Majed AbuSafiya 

Ho Lam Lau

Qingzhao Tan 

James Cheng 

Jarogniew Rykowski 



Huiyong Xiao 
Lotfi Bouzguenda 
Jim Stinger 
Ren Wu 

Khalid Belhajjame 
Gennaro Bruno
Fabrice Jouanot 
Thi Huong Giang Vu 
Trinh Tuyet Vu 
Yutaka Kidawara 
Kazutoshi Sumiya 
Satoshi Oyama 
Shinsuke Nakajima 
Koji Zettsu 
Gopal Gupta 
Campbell Wilson 
Maria Indrawan 
Agustinus Borgy Waluyo 
Georgia Koloniari 
Christian Kop 
Robert Grascher 
Volodymyr Sokol 
Wook-Shin Han 
Won-Young Kim
George A. Mihaila 
Ioana R. Stanoi 
Takeshi Sagara 
Noriko Imafuji 
Shingo Ohtsuka 
Botao Wang 
Masayoshi Aritsugi 
Katsumi Takahashi
Anirban Mondal 
Tadashi Ohmori 
Kazuki Goda 
Miyuki Nakano 




Table of Contents 



Workflow I 

Supporting Contract Execution through Recommended Workflows 1 

Roger Tagg, Zoran Milosevic, Sachin Kulkarni, and Simon Gibson 

An Ontology-Driven Process Modeling Framework 13 

Gianluigi Greco, Antonella Guzzo, Luigi Pontieri, and Domenico Saccà

Ensuring Task Dependencies During Workflow Recovery 24 

Indrakshi Ray, Tai Xin, and Yajie Zhu 

Web Service Based Architecture for Workflow Management Systems 34 

Xiaohui Zhao, Chengfei Liu, and Yun Yang 

Active and Deductive DB Aspects 

Feasibility Conditions and Preference Criteria in Querying and Repairing Inconsistent Databases 44

Sergio Greco, Cristina Sirangelo, Irina Trubitsyna, and Ester Zumpano 

An Active Functional Intensional Database 56 

Paul Swoboda and John Plaice 

Optimal Deployment of Triggers for Detecting Events 66 

Manish Bhide, Ajay Gupta, Mukul Joshi, and Mukesh Mohania

A New Approach for Checking Schema Validation Properties 77 

Carles Farre, Ernest Teniente, and Toni Urpi 

Workflow II 

Autonomic Group Protocol for Peer-to-Peer (P2P) Systems 87 

Tomoya Enokido and Makoto Takizawa 

On Evolution of XML Workflow Schemata 98 

Fábio Zschornack and Nina Edelweiss

A Framework for Selecting Workflow Tools in the Context of Composite Information Systems 109

Juan P. Carvallo, Xavier Franch, Carme Quer, and Nuria Rodriguez 







Queries I (Multidimensional Indexing) 

Evaluation Strategies for Bitmap Indices with Binning 120 

Kurt Stockinger, Kesheng Wu, and Arie Shoshani 

On the Automation of Similarity Information Maintenance in Flexible Query Answering Systems 130

Balázs Csanád Csáji, Josef Küng, Jürgen Palkoska, and Roland Wagner

An Efficient Neighbor Searching Scheme of Distributed Collaborative Filtering on P2P Overlay Network 141

Bo Xie, Peng Han, Fan Yang, and Ruimin Shen 

Applications 

Partially Ordered Preferences Applied to the Site Location Problem in Urban Planning 151

Sylvain Lagrue, Rodolphe Devillers, and Jean-Yves Besqueut 

A Flexible Fuzzy Expert System for Fuzzy Duplicate Elimination in Data Cleaning 161

Hamid Haidarian Shahri and Ahmad Abdolahzadeh Barforush 

DBSitter: An Intelligent Tool for Database Administration 171 

Adriana Carneiro, Romulo Passos, Rosalie Belian, Thiago Costa, Patricia Tedesco, and Ana Carolina Salgado

Interacting with Electronic Institutions 181 

John Debenham 

Queries II (Multidimensional Indexing) 

Growing Node Policies of a Main Memory Index Structure for Moving Objects Databases 191

Kyounghwan An and Bonghee Hong 

Optimal Subspace Dimensionality for k-NN Search on Clustered Datasets 201

Yue Li, Alexander Thomasian, and Lijuan Zhang 

PCR-Tree: An Enhanced Cache Conscious Multi-dimensional Index Structures 212

Young Soo Min, Chang Yong Yang, Jae Soo Yoo, Jeong Min Shim, and Seok Il Song







Knowledge Processing and Information Retrieval

Classification Decision Combination for Text Categorization: An Experimental Study 222

Yaxin Bi, David Bell, Hui Wang, Gongde Guo, and Werner Dubitzky 

Information Extraction via Automatic Pattern Discovery in Identified Region 232 

Liping Ma and John Shepherd 

CRISOL: An Approach for Automatically Populating Semantic Web from Unstructured Text Collections 243

Roxana Danger, Rafael Berlanga, and José Ruiz-Shulcloper

Text Categorization by a Machine-Learning-Based Term Selection 253 

Javier Fernandez, Elena Montanes, Irene Diaz, Jose Ranilla, and Elias F. Combarro

Queries III (XML) 

Update Conscious Inverted Indexes for XML Queries in Relational Databases 263 

Dong-Kweon Hong and Kweon-Yang Kim 

A Selective Key-Oriented XML Index for the Index Selection Problem in XDBMS 273

Beda Christoph Hammerschmidt, Martin Kempa, and Volker Linnemann 

SUCXENT: An Efficient Path-Based Approach to Store and Query XML Documents 285

Sandeep Prakash, Sourav S. Bhowmick, and Sanjay Madria 

Querying Distributed Data in a Super-Peer Based Architecture 296 

Zohra Bellahsene and Mark Roantree 

Digital Libraries and Information Retrieval I 

Phrase Similarity through the Edit Distance 306 

Manuel Vilares, Francisco J. Ribadas, and Jesus Vilares 

A Document Model Based on Relevance Modeling Techniques for Semi-structured Information Warehouses 318

Juan Manuel Perez, Rafael Berlanga, and Maria Jose Aramburu 

Retrieving Relevant Portions from Structured Digital Documents 328 

Sujeet Pradhan and Katsumi Tanaka 







Query IV (OLAP) 

Parallel Hierarchical Data Cube for Range Sum Queries and Dynamic Updates 339 

Jianzhong Li and Hong Gao 

A Bitmap Index for Multidimensional Data Cubes 349 

Yoonsun Lim and Myung Kim

Analytical Synopses for Approximate Query Answering in OLAP Environments 359

Alfredo Cuzzocrea and Ugo Matrangolo 

Digital Libraries and Information Retrieval II 

Morphological and Syntactic Processing for Text Retrieval 371 

Jesus Vilares, Miguel A. Alonso, and Manuel Vilares 

Efficient Top-k Query Processing in P2P Network 381

Yingjie He, Yanfeng Shu, Shan Wang, and Xiaoyong Du 

Improved Data Retrieval Using Semantic Transformation 391 

Barry G. T. Lowden and Jerome Robinson

A Structure-Based Filtering Method for XML Management Systems 401 

Olli Luoma 

Mobile Information Systems 

Uncertainty Management for Network Constrained Moving Objects 411 

Zhiming Ding and Ralf Hartmut Güting

Towards Context-Aware Data Management for Ambient Intelligence 422 

Ling Feng, Peter M.G. Apers, and Willem Jonker 

TriM: Tri-Modal Data Communication in Mobile Ad-Hoc Networks 432 

Leslie D. Fife and Le Gruenwald 

Knowledge Processing I 

Efficient Rule Base Verification Using Binary Decision Diagrams 445 

Christophe Mues and Jan Vanthienen 

How to Model Visual Knowledge: A Study of Expertise in Oil-Reservoir Evaluation 455

Mara Abel, Laura S. Mastella, Luis A. Lima Silva, John A. Campbell, and Luis Fernando De Ros







A New Approach of Eliminating Redundant Association Rules 465 

Mafruz Zaman Ashrafi, David Taniar, and Kate Smith 

An a Priori Approach for Automatic Integration of Heterogeneous and Autonomous Databases 475

Ladjel Bellatreche, Guy Pierra, Dung Nguyen Xuan, Dehainsala Hondjack, and Yamine Ait Ameur

Knowledge Processing II 

PC-Filter: A Robust Filtering Technique for Duplicate Record Detection in Large Databases 486

Ji Zhang, Tok Wang Ling, Robert M. Bruckner, and Han Liu 

On Efficient and Effective Association Rule Mining from XML Data 497 

Ji Zhang, Tok Wang Ling, Robert M. Bruckner, A Min Tjoa, and Han Liu 

Support for Constructing Theories in Case Law Domains 508 

Alison Chorley and Trevor Bench-Capon 

Knowledge Processing III 

Identifying Audience Preferences in Legal and Social Domains 518 

Paul E. Dunne and Trevor Bench-Capon

Characterizing Database User's Access Patterns 528 

Qingsong Yao and Aijun An 

Using Case Based Retrieval Techniques for Handling Anomalous Situations in Advisory Dialogues 539

Marcello L'Abbate, Ingo Frommholz, Ulrich Thiel, and Erich Neuhold

A Probabilistic Approach to Classify Incomplete Objects Using Decision Trees 549

Lamis Hawarah, Ana Simonet, and Michel Simonet 

XML I 

A Graph-Based Data Model to Represent Transaction Time in Semistructured Data 559

Carlo Combi, Barbara Oliboni, and Elisa Quintarelli 

Effective Clustering Schemes for XML Databases 569 

William M. Shui, Damien K. Fisher, Franky Lam, and Raymond K. Wong







Detecting Content Changes on Ordered XML Documents Using Relational Databases 580

Erwin Leonardi, Sourav S. Bhowmick, T.S. Dharma, and Sanjay Madria

Timestamp-Based Protocols for Synchronizing Access on XML Documents 591 

Sven Helmer, Carl-Christian Kanne, and Guido Moerkotte 

Distributed and Parallel Data Bases I 

On Improving the Performance Dependability of Unstructured P2P Systems via Replication 601

Anirban Mondal, Yi Lifu, and Masaru Kitsuregawa 

Processing Ad-Hoc Joins on Mobile Devices 611 

Eric Lo, Nikos Mamoulis, David W. Cheung, Wai Shing Ho, and Panos Kalnis 

Preserving Consistency of Dynamic Data in Peer-Based Caching Systems 622 

Song Gao, Wee Siong Ng, and Weining Qian 

Efficient Processing of Distributed Iceberg Semi-joins 634 

Mohammed Kasim Imthiyaz, Dong Xiaoan, and Panos Kalnis 

Advanced Database Techniques I 

Definition of Derived Classes in ODMG Databases 644 

Eladio Garvi, Jose Samos, and Manuel Torres 

Applying a Fuzzy Approach to Relaxing Cardinality Constraints 654 

Harith T. Al-Jumaily, Dolores Cuadra, and Paloma Martinez 

In Support of Mesodata in Database Management Systems 663 

Denise de Vries, Sally Rice, and John F. Roddick 

Distributed and Parallel Data Bases II 

Moderate Concurrency Control in Distributed Object Systems 675 

Yousuke Sugiyama, Tomoya Enokido, and Makoto Takizawa

Performance Evaluation of a Simple Update Model and a Basic Locking Mechanism for Broadcast Disks 684

Stephane Bressan and Guo Yuzhi 







Adaptive Double Routing Indices: Combining Effectiveness and Efficiency in P2P Systems 694

Stephane Bressan, Achmad Nizar Hidayanto, Chu Yee Liau, and Zainal A. Hasibuan

Advanced DB Techniques II 

Efficient Algorithms for Multi-file Caching 707 

Ekow J. Otoo, Doron Rotem, and Sridhar Seshadri 

A System for Processing Continuous Queries over Infinite Data Streams 720 

Ehsan Vossough 

Outer Join Elimination in the Teradata RDBMS 730 

Ahmad Ghazal, Alain Crolotte, and Ramesh Bhashyam 

Formalising Software Quality Using a Hierarchy of Quality Models 741 

Xavier Burgues Illa and Xavier Franch

Bioinformatics 

RgS-Miner: A Biological Data Warehousing, Analyzing and Mining System for Identifying Transcriptional Regulatory Sites in Human Genome 751

Yi-Ming Sun, Hsien-Da Huang, Jorng-Tzong Horng, Ann-Ping Tsou, and Shir-Ly Huang

Effective Filtering for Structural Similarity Search in Protein 3D Structure Databases 761

Sung Hee Park and Keun Ho Ryu 

Fast Similarity Search for Protein 3D Structure Databases Using Spatial Topological Patterns 771

Sung Hee Park and Keun Ho Ryu 

Ontology-Driven Workflow Management for Biosequence Processing Systems 781

Melissa Lemos, Marco Antonio Casanova, Luiz Fernando Bessa Seibel, Jose Antonio Fernandes de Macedo, and Antonio Basílio de Miranda

XML II 

Towards Integration of XML Document Access and Version Control 791 

Somchai Chatvichienchai, Chutiporn Anutariya, Mizuho Iwaihara, Vilas Wuwongse, and Yahiko Kambayashi







Prefix Path Streaming: A New Clustering Method for XML Twig Pattern Matching 801

Ting Chen, Tok Wang Ling, and Chee-Yong Chan 

A Self-adaptive Scope Allocation Scheme for Labeling Dynamic XML Documents 811

Yun Shen, Ling Feng, Tao Shen, and Bing Wang 

F2/XML: Navigating through Linked XML Documents 822 

Lina Al-Jadir, Fatme El-Moukaddem, and Khaled Diab 

Temporal and Spatial Data Bases I 

Declustering of Trajectories for Indexing of Moving Objects Databases 834 

Youngduk Seo and Bonghee Hong 

Computing the Topological Relationship of Complex Regions 844 

Markus Schneider 

A Framework for Representing Moving Objects 854 

Ludger Becker, Henrik Blunck, Klaus Hinrichs, and Jan Vahrenhold 

Temporal Functional Dependencies with Multiple Granularities: A Logic Based Approach 864

Carlo Combi and Rosalba Rossato 

Web I 

Device Cooperative Web Browsing and Retrieving Mechanism on Ubiquitous Networks 874

Yutaka Kidawara, Koji Zettsu, Tomoyuki Uchiyama, and Katsumi Tanaka 

A Flexible Security System for Enterprise and e-Government Portals 884 

Torsten Priebe, Bjorn Muschall, Wolfgang Dobmeier, and Gunther Pernul

Guiding Web Search by Third-Party Viewpoints: Browsing Retrieval Results by Referential Contexts in Web 894

Koji Zettsu, Yutaka Kidawara, and Katsumi Tanaka 



Temporal and Spatial Data Bases II 



Algebra-to-SQL Query Translation for Spatio-temporal Databases 904

Mohammed Minout and Esteban Zimányi





Visualization Process of Temporal Data 914 

Chaouki Daassi, Laurence Nigay, and Marie-Christine Fauvet 

XPQL: A Pictorial Language for Querying Geographical Data 925 

Fernando Ferri, Patrizia Grifoni, and Maurizio Rafanelli 

Web II 

HW-STALKER: A Machine Learning-Based Approach to Transform Hidden Web Data to XML 936

Vladimir Kovalev, Sourav S. Bhowmick, and Sanjay Madria

Putting Enhanced Hypermedia Personalization into Practice via Web Mining 947 

Eugenio Cesario, Francesco Folino, and Riccardo Ortale 

Extracting User Behavior by Web Communities Technology on Global Web Logs 957

Shingo Otsuka, Masashi Toyoda, Jun Hirai, and Masaru Kitsuregawa

Author Index 969 




Supporting Contract Execution through 
Recommended Workflows 



Roger Tagg¹, Zoran Milosevic², Sachin Kulkarni², and Simon Gibson²



¹ University of South Australia, School of Computer and Information Science
Mawson Lakes, SA 5095, Australia
Roger.Tagg@unisa.edu.au
² CRC for Enterprise Distributed Systems Technology (DSTC)
Level 7, GP South, University of Queensland,
Brisbane, Q 4072, Australia
{zoran, sachink, sgibson}@dstc.edu.au



Abstract. This paper extends our previous research on e-contracts by
investigating the problem of deriving business process specifications from business
contracts. The aim here is to reduce the risk of behaviour leading to contract
violations by encouraging the parties to a contract to follow execution paths that
satisfy the policies in the contract. Our current contract monitoring prototype
provides run-time checking of policies in contracts. If this system were linked to
workflow systems that automate the associated business processes in the
contract parties, a finer grain of control and early warning could be provided. We
use an example contract to illustrate the different views and the problems of
deriving business processes from contracts. We propose a set of heuristics that can
be used to facilitate this derivation.



1 Introduction 



Most business transactions are based on a contract of some form. However, in most of
today’s organizations, including their IT systems support, contracts are treated as
isolated entities, far removed from their essential role as a governance mechanism for
business transactions. This can lead to many problems, including the failure to detect,
and react in a timely manner to, business transaction events that could result in contract
violations or regulatory non-compliance.

As a result, several vendors have begun offering self-standing enterprise contract
management software [2][3][4][6]. These systems consist mostly of a number of
pre-built software components and modules that can be deployed to meet specific contract
requirements. However, our earlier work [7][8] suggests that a more generic approach is
needed that more closely reflects contract semantics, in particular in terms of the
governance role. This means adopting higher level modelling concepts that directly
reflect the business language of a contract and the policies that express constraints on
the parties involved. Examples of these are obligations, permissions, prohibitions and
authorisations. This implies a need for specialised languages to express these contract
semantics.

F. Galindo et al. (Eds.): DEXA 2004, LNCS 3180, pp. 1-12, 2004.
© Springer-Verlag Berlin Heidelberg 2004


In previous papers we presented our language-based solution for the expression of 
contract semantics in a way suitable for the automation of contract monitoring [7] [8]. 
This language, Business Contract Language (BCL), is used to specify monitoring 
conditions that can then be interpreted by a contract engine. This paper investigates to 
what extent the semantics of contracts can be used to infer business processes which, 
if followed by the trading partners, would help reduce the risks associated with con- 
tract non-compliance. Such processes may be able to provide a finer grain of monitor- 
ing to complement that achievable through the BCL alone. We refer to these business 
processes as ‘recommended’ business processes - to reflect the fact that they can only 
be a guiding facility for managing activities related to contracts, and that the different 
parties’ organisational policies and cultures may impose limitations on how far busi- 
ness processes can be structured or how strictly they should be mandated. 

The paper begins with a motivating example of a water supply maintenance situa- 
tion that could benefit from the automation of contract related activities. In the subse- 
quent section we describe how BCL can be used to express monitoring conditions for 
this system. We then present a model of the same contract seen as a business process, 
following which we discuss the problems associated with the translation of contract 
conditions into a business process, referring to the lessons we learned in trying this. 
Next we present a proposed approach for the derivation of business processes from a contract, 
based on heuristics related to different types of contract and clause. This is followed 
by an overview of related work. The paper concludes with a summary of areas 
for future research and a brief conclusion. 



2 Motivating Example 

In this fictitious example, Outback Water (OW) is a utility organisation that provides 
water to agriculture, industry (primarily mining and oil/gas extraction) and small 
towns in certain central parts of Australia. It operates some storage lakes and both 
open irrigation canals and pipelines. 

OW makes contracts with maintenance subcontractors for servicing and maintain- 
ing its assets (e.g. pumps, valves, etc) located in its facilities in various areas. The 
contracts are of a repetitive and potentially continuing nature. Contracts are for a year 
and cover a list of assets that have to be maintained. 

From the point of view of OW's service to its customers, its Quality of Service 
(QoS) objective is to ensure that the average and worst loss of service to any customer 
is within a stated maximum number of days. OW uses MTBF (Mean Time Between 
Failures) and MTTR (Mean Time To Repair) as its main measures of asset 
availability. 



Supporting Contract Execution through Recommended Workflows 






The contract is summarised in the following table: 

Table 1. Representation of the contract between Outback Water and a Maintenance 
Subcontractor 

Obligations: Subcontractor 

s1   Make its best efforts to ensure that the following QoS conditions are met: 
     - not exceed the maximum asset down time on any one asset 
     - not exceed the call-out time limit on more than 5% of emergencies in a month 
     - average above the specified MTBF and below the MTTR over a month 
     The maximum or minimum values are provided in a schedule to the contract. 

s2   Submit monthly reports on all preventative maintenance activities and emergency 
     events, including full timing details and description of problems and action 
     taken, broken down into labour, replacement parts and materials. 

s3   Inform the asset operator within 24 hours of any event that might affect the ability 
     to achieve the quality of service, e.g. resignation of subcontractor engineers, 
     recurring problem with certain asset types. 

s4   Submit monthly invoices of money due to the subcontractor. 

Obligations: Asset Operator (OW) 

ow1  Pay the subcontractor on monthly invoice within 30 days. 

ow2  Provide list of assets to be maintained, with clear instructions of the maintenance 
     cycles required (asset lists are in a schedule to the contract, maintenance manuals 
     are in associated paper or on-line documents). 

ow3  Provide clear MTBF and MTTR targets. 

ow4  Feed back to the subcontractor any information received about problems with the 
     water supply, including emergencies reported by its customers within 24 hours. 

ow5  Give the subcontractor access to all the asset sites. 

ow6  After each of the 1st and 2nd quarters, give guidance to the subcontractor on how 
     any shortcomings in the service might be improved. 

Permissions: Asset Operator 

ow7  May take on an additional subcontractor in the event that the appointed subcontractor 
     is having difficulty in meeting the QoS targets. 

ow8  After the 3rd quarter of the contract, may give the subcontractor notice to quit or 
     to be asked to continue for another year. 

Prohibitions: Subcontractor 

s5   Not allowed to re-assign maintenance tasks to a sub-sub-contractor. 



3 Expressing Contract Monitoring Conditions Using BCL 

BCL is a language developed specifically for the purpose of monitoring the behaviour of 
parties involved in business contracts. Key concepts of this language are [7]: 

• Community - A container for roles and their relationships in a cross-enterprise 
arrangement. A Community may be instantiated from a Community Template. 

• Policy - General constraints on behaviour for a community, expressed as permis- 
sions, prohibitions, obligations or authority. In combination these make up the 
terms of the contract. 






• State - information containing the value of variables relevant to the community; 
may change in respect to events or time. 

• Event - any significant occurrence generated by the parties to the contract, an ex- 
ternal source, or a temporal condition. 

• Event Pattern - an expression relating two or more events that can be detected 
and used in checking compliance with a policy (see [5] for similar concepts). 

The BCL concepts introduced above can be used to express a model for a specific 
business contract, such as that between OW and a sub-contractor. These models are 
then interpreted by a contract engine, to enable evaluation of actual contract execution 
versus agreed contract terms. This evaluation requires access to the contract-related 
data, events and states as they change during the execution of business processes. 

In the water supply example there are a number of clauses that are suitable for run-time 
monitoring, but for brevity we choose only the clauses under s1 in Table 1. The 
contract should have a schedule describing each asset and the availability objectives 
associated with that asset. As part of the contract the sub-contractor must submit a 
monthly report outlining all tasks performed whether routine maintenance or emer- 
gency repairs. This report should contain basic details that will be used to calculate 
adherence to the QoS metrics. The report will need to identify the asset, and contain a 
description of the task, the start time and the finish time. In addition to this, for any 
emergency task the actual time of failure should be indicated. 
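
The report contents described above can be sketched as a small data structure. The following Python fragment is a minimal illustration, not part of BCL or of the prototype: the class and field names (`TaskReport`, `asset_id`, `time_of_failure`) are assumptions, and MTTR is taken here to be the mean of (finish time - time of failure) over emergency tasks, which is one plausible reading of the metric.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class TaskReport:
    """One task line of the monthly report (field names are hypothetical)."""
    asset_id: str
    description: str
    start: datetime
    finish: datetime
    time_of_failure: Optional[datetime] = None  # given only for emergency tasks

    @property
    def is_emergency(self) -> bool:
        return self.time_of_failure is not None


def emergency_downtimes(tasks: List[TaskReport], asset_id: str) -> List[timedelta]:
    """Downtime of each emergency on one asset: finish time minus time of failure."""
    return [
        t.finish - t.time_of_failure
        for t in tasks
        if t.asset_id == asset_id and t.is_emergency
    ]


def total_downtime(tasks: List[TaskReport], asset_id: str) -> timedelta:
    return sum(emergency_downtimes(tasks, asset_id), timedelta())


def mean_time_to_repair(tasks: List[TaskReport], asset_id: str) -> Optional[timedelta]:
    """Assumed definition: average emergency downtime; None if no emergencies."""
    downtimes = emergency_downtimes(tasks, asset_id)
    if not downtimes:
        return None
    return sum(downtimes, timedelta()) / len(downtimes)


# Illustrative data only.
report = [
    TaskReport("pump-7", "routine service",
               datetime(2004, 3, 2, 9), datetime(2004, 3, 2, 11)),
    TaskReport("pump-7", "seal failure",
               datetime(2004, 3, 10, 8), datetime(2004, 3, 10, 12),
               time_of_failure=datetime(2004, 3, 10, 6)),
    TaskReport("pump-7", "motor trip",
               datetime(2004, 3, 20, 14), datetime(2004, 3, 20, 16),
               time_of_failure=datetime(2004, 3, 20, 14)),
]
```

With this data, the two emergencies contribute 6 and 2 hours of downtime, so the total is 8 hours and the assumed MTTR is 4 hours. Routine maintenance tasks carry no time of failure and are ignored by both measures.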




Fig. 1. BCL concepts for part of a water supply maintenance contract 






To begin with, an overall community should be defined for the entire contract. In 
this example, each asset has some of its own monitoring behaviour and so each asset 
can be seen as being a sub-community of the overall community. Fig. 1 outlines some 
of the required constructs. The full lines indicate event flow and the broken lines indi- 
cate data flow. 

The parent community template defines an event creation rule (ECR) that extracts 
each task from a SubContractorMonthlyReport event and passes the task as an event 
to the associated sub-community instance. There are a number of States that collect 
these events and perform an associated calculation. Policies can then use the value of 
these states as well as values defined in the contract schedule to determine whether 
constraints have been met or violated. The trigger for evaluating these Policies is 
when an event indicating the EndOfMonth is received. It should be noted that a Guard 
is placed on most of the Policies declaring that the SubContractorMonthlyReport must 
be received prior to the EndOfMonth event. Additional Policies could be used to enforce 
this behaviour but are not shown here for reasons of brevity. Notifications are 
used to notify human users that a violation has occurred. Table 2 provides BCL syn- 
tax for the specification of one fragment of the contract, namely asset downtime state, 
policies and notifications. 



Table 2. BCL syntax examples for asset downtime specifications 

EventCreationRule: AssetTaskReport 
    GenerateOn: SubContractorMonthlyReport 
    ContentToGenerate: 
        Loop through report and create an 
        AssetTaskReport for each task 

State: downtimeState 
    On event: AssetTaskReport 
        If it's an emergency task, calculate the 
        downtime and add it to total 
        total = total + (FinishDateTime - TimeOfFailure) 

Policy: downtimeLimit 
    Guard: SubContractorMonthlyReport 
    On event: EndOfMonthVerification 
        Checks if downtimeState value is greater than 
        the defined value of MaxAssetDowntime metric 

Notification: downtimeLimitNotification 
    On event: downtimeLimitPolicyEvaluationEvent 
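
To make the interplay of event creation rule, state, guard, policy and notification more concrete, the following Python sketch mimics the behaviour described above for a single asset. It is an illustration under stated assumptions, not the BCL engine: the class `AssetDowntimeMonitor`, its method names and the dictionary keys are hypothetical, and the guard is modelled simply as "a monthly report has been received".

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class AssetDowntimeMonitor:
    """Hypothetical stand-in for one asset sub-community."""
    max_asset_downtime: timedelta            # value from the contract schedule
    downtime: timedelta = timedelta()        # plays the role of downtimeState
    report_received: bool = False            # used by the guard
    notifications: List[str] = field(default_factory=list)

    def on_monthly_report(self, tasks: List[dict]) -> None:
        """Event creation rule: emit one task event per task in the report."""
        self.report_received = True
        for task in tasks:
            self.on_asset_task_report(task)

    def on_asset_task_report(self, task: dict) -> None:
        """State update: only emergency tasks (with a failure time) add downtime."""
        failure = task.get("time_of_failure")
        if failure is not None:
            self.downtime += task["finish"] - failure

    def on_end_of_month(self) -> Optional[bool]:
        """Policy check; returns None if the guard is not satisfied."""
        if not self.report_received:
            return None
        violated = self.downtime > self.max_asset_downtime
        if violated:
            self.notifications.append(
                f"downtimeLimit violated: {self.downtime} > {self.max_asset_downtime}"
            )
        return violated


# Illustrative run: one 6-hour emergency against a 4-hour limit.
monitor = AssetDowntimeMonitor(max_asset_downtime=timedelta(hours=4))
monitor.on_monthly_report([
    {"finish": datetime(2004, 3, 10, 12), "time_of_failure": datetime(2004, 3, 10, 6)},
    {"finish": datetime(2004, 3, 15, 11)},  # routine task, no failure time
])
violated = monitor.on_end_of_month()
```

In this sketch a missing report leaves the policy unevaluated rather than violated; as noted in the text, a separate policy would be needed to flag the missing report itself.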



Note that although we use an event-driven approach for the monitoring, these are 
infrequent events and this contract can be characterised as a system-state invariant 
contract. For more information about various characteristics of contract clauses, see 
their classification in section 6. 



4 Deriving Business Processes from Contracts 

A contract exists for a limited purpose - to express constraints on the behaviour of 
signatories with the aim of achieving their individual objectives in the presence of uncertainty. 
It does not attempt to prescribe the “how” of a business process; rather, it is 
limited to the conditions that need to be satisfied for the parties to comply with the 
contract. In practice, in order to ensure that a contract is satisfied, the parties - sepa- 
rately and together - must have processes (which may be in part informal) for meet- 
ing their obligations under the contract. 






A formal workflow might be able to add the following to contract management: 

• Guidance to the human participants in each contract party, particularly where the 
staff involved are not experienced in the pattern of collaboration - answering 
“what do we do next?” 

• Auditing: answering “who actually performed the constituent activities?” in case 
of a breach in the contract 

• Early warning: if activities in either party are behind schedule at a detailed level, 
it may be possible to re-assign resources to remedy this. 




Fig. 2. A business process for the water supply maintenance example 



A business process will generally be at a finer level of detail than the contract 
clauses. When trying to derive business processes for the water supply example (see 
Figure 2), we needed to introduce a number of assumptions which were not explicitly 
stated in the contract. Examples of such “introduced” behaviour are activities such as 
Issue Work Order, Amend Work Order and Prepare Resolution plan. Another exam- 
ple is the activity: “the sub-contractor can be given notice” which implies that the as- 
set owner must review performance against the contract. This finding is in line with 
our previous experience reported in [1] and is a result of the fact that the contract only 
















Supporting Contract Execution through Recommended Workflows 



7 



states a broad framework of possible executions and that many behaviour trajectories 
can satisfy the policies stated in the contract. 

Therefore, the business process in Figure 2 is one possible way to satisfy the poli- 
cies in this contract. The example also shows the separation of processes across OW 
and the sub-contractor; two levels of nesting for the month and quarter periods; and 
two repeating activities (problem and work order sub-processes). In addition it shows 
a need for supporting external events (not originating from within the workflow). 

Once this process is in place, it would then be possible to use the events generated 
through the corresponding workflow system as input to a contract monitoring system, 
such as one that utilizes BCL and the underlying interpreter engine [7]. Figure 2 
highlights possible points where contract monitoring conditions can be applied 
(shown using the black BCL symbol). 



5 Discussion Points 

There are a number of considerations that we faced when working with this example. 
Some of the key questions and possible solutions are outlined in this section. 



5.1 The Feasibility of Deducing Business Processes 

How many activities are deducible by understanding the nature of the contract 
clauses? How much dependency between activities is explicitly stated in a contract? 

In our analysis we found that, although we started from the same natural language 
specification as described above, the questions we had to ask for the two models, 
namely contract monitoring and workflow, were quite different. Several of the activi- 
ties that the subcontractor should perform were not mentioned in the contract. We can 
deduce, for example, that the subcontractor must send a monthly report (and invoice) 
and inform OW immediately of any noteworthy problems. But it is not prescribed that 
the subcontractor must make a monthly plan and create work orders, or that they 
should revise the work orders following an emergency. 

It is difficult to envisage any general rules that could be applied to all types of con- 
tracts and clauses. 



5.2 Inter-organizational Workflow Versus Separate Workflows 

Supposing we can deduce a recommended business process, how can it be usefully 
expressed? 

One possibility is to propose a single inter-enterprise workflow. However, this is 
not likely to be politically acceptable unless the parties to the contract have a very 
high level of mutual trust, and are not so concerned about their autonomy. 

An alternative is to offer the workflow in two separate sections, one for each party, 
showing where they need to interact with the other, as in BPEL [10]. However, highly 
autonomous parties may still object to too much detailed prescription, and may already 
have their own workflow patterns for performing services of this type. A water 
system maintenance subcontractor might have, for example, worked on maintenance 
for other clients. In this case, it would be better to leave the finer process detail as 
“black boxes” for each party’s managers to decide themselves. 

It is really a part of contract negotiation to agree what level of integration of process 
and data the parties to a contract will subject themselves to. If cooperation needs 
to be close, however, too little control may be inadequate. Someone in one organisation 
may need to ask the other “where exactly are you on this?” In such cases it 
could be desirable for each company to allow some inspection of their own local 
workflow status by their collaborators. 



5.3 Dependence on Data Capture 

How do we verify that data relating to the contract clauses is reliably captured? For 
example, are there remotely readable real time meters on the pumps, or does OW 
have to rely on the subcontractor? How do we know that the subcontractor’s engineer 
has properly serviced a pump, or that the required report contains at least the pre- 
scribed minimum details? 

Verification of completion of activities is not necessarily assured by simply auto- 
mating the workflow or the contract monitoring. Many workflow management sys- 
tems (WfMS) allow a performer to simply click a link that says “Completed”. 

In obligations and prohibitions, and in the effectiveness of the granting of permis- 
sions, how do we monitor non-compliance? In our example, how does anyone find 
out if a sub-sub-contractor has been called in discreetly, or that a key to an installation 
has not been provided? There are cases where it may be against the interests of one 
party to reveal the data. 

In general, if contract clauses rely critically on the values of captured data, then a 
loop to verify those data may be needed. This can be added as an additional element 
in the recommended process. 



5.4 Overriding the Contract Due to Force Majeure 

A further question is, what happens if the contract itself has to be altered? An exam- 
ple might be a drought that caused a systematic failure in many pumps due to impuri- 
ties. Such overrides would need to be reflected in any running workflows as well as 
the contract. If the contract is subject to such alteration, it would imply the need for 
any software supporting a workflow implementation of the business processes to al- 
low easy adaptation of the workflow template at run time. 






6 Proposals 

This section provides a number of proposals to assist in deriving recommended business 
processes based on the contract expressions. They represent our early ideas on 
this mapping problem and will need to be further elaborated in our future research. 



6.1 Analysis of the Types of Contracts and Contract Clauses Involved 

Contract clauses vary a lot in style - and the contracts as a whole in their balance of 
these. The following classification is suggested, based on previous examples in the 
e-commerce area and the example we are currently using. 

• System state invariant - this means that certain measurements on the real world 
situation must not be allowed to go outside certain bounds at any time. In our example 
these measurements could be the MTBF, MTTR and total down time. A procedure 
must exist within the party responsible for the system state for achieving 
this. In our case there has to be a maintenance plan, scheduling when each pump 
is going to be maintained. If the subcontractor falls behind on its work, then the 
impact on the contract may not be immediate. The effect on the MTBF and MTTR 
figures will only show up when they are next re-calculated and reported. Depending 
on the data, it may be possible to provide early warning of likely failure to meet the 
requirement. 

• Deadline - this means that some event must occur by a certain date (usually relative 
to a starting point or a previous event). In our case study, examples are submission 
of a monthly report, and of additional events and feedback in both directions. 
Early warning may be possible if the activities can be broken down into 
smaller measurable stages. 

• Event-dependent - this implies that some activity must occur following some 
specified event. In our example, the event could be an emergency in which an ir- 
rigation pump for a critical crop failed. In an e-business contract, the event could 
be the placing of an order. 

• Artefact quality - for the contract to succeed, this implies an inspection stage, 
which may be followed by an iterative re-work loop. The artefact may be physi- 
cal (e.g. delivered goods) or informational (e.g. a report or design). 

• Nested - some contracts are at a single level, e.g. the once-off supply of a number 
of a particular product. More often the contract has multiple instances, in possibly 
more than one dimension. In our case study, we have multiple assets. Many other 
contracts cover multiple business cases, repeated orders etc. This implies proc- 
esses at both the individual level and at the overall contract level. 

• Periodic - some contracts are for a single instance of some activity, others are 
subject to regular calendar-based repetition, including our own example. There- 
fore there are processes that repeat within each calendar period. 

• Exception specification - this explicitly states a process that is to be followed if 
things go wrong. In our example, OW can terminate the contract after the 3rd 
quarter. In other cases, there may be penalty clauses, procedures for agreeing extensions 
and so on. It often makes sense to provide prompting to parties to a contract 
that they should enforce their rights. 



6.2 Heuristic Rules for Deriving Recommended Sub-workflows 

While it is possible for contract architecture and business process model to be derived 
independently - as we have done - there does seem to be the opportunity for recommending 
a set of heuristic rules that may help to suggest the structure of the recommended 
workflow, based on clause characteristics. 

The following table shows a summary of the heuristics that could be applied: 



Table 3. Summary of suggested heuristics 

Heuristic:        Introduce escalation branches (penalties, extensions etc.) 
Contract types:   Exception 
Deontic modality: Obligations, Permissions, Prohibitions 
Comments:         This is the easiest to derive, as the process is usually explicit 
                  in the contract 

Heuristic:        Introduce sub-processes for activities inside the nesting or periodicity 
Contract types:   Nested, Periodic 
Deontic modality: All 
Comments:         Progress on the individual business cases, or periods, is the best 
                  early warning 

Heuristic:        Introduce loops for checking the deliverable and iterating to achieve 
                  quality 
Contract types:   Quality 
Deontic modality: Obligations 
Comments:         The requestor may want to reserve the right not to accept the completion 
                  of the service. 

Heuristic:        Introduce planning activities corresponding to a required level of 
                  performance 
Contract types:   Status 
Deontic modality: Obligations, Permissions 
Comments:         The party requesting the service may want to be confident that the 
                  subcontractor has adequate resources and procedures to meet the 
                  requirement 

Heuristic:        Introduce a renegotiation phase in case the contract needs changing 
Contract types:   Nested, Periodic 
Deontic modality: All 
Comments:         If things don’t go right in one period, or on one business case, the 
                  parties may want to allow adjustment of the contract process itself 

Heuristic:        Introduce related reporting and other information flow phases 
Contract types:   All except exceptions 
Deontic modality: Obligations 
Comments:         If required performance is specified, but no reporting activity, then 
                  this should be added 
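
The heuristics in Table 3 can be read as a lookup from clause characteristics to suggested workflow fragments. The Python sketch below illustrates this reading only; it is not a tool described in the paper. The short labels ("Status", "Quality", "Exception") follow the table, the set of "all" clause types is assumed to be the seven types of Section 6.1, and the deontic modality column is omitted for brevity.

```python
from typing import FrozenSet, Iterable, List, NamedTuple

# Assumed universe of clause types, using the short labels of Table 3.
CLAUSE_TYPES: FrozenSet[str] = frozenset({
    "Status", "Deadline", "Event-dependent", "Quality",
    "Nested", "Periodic", "Exception",
})


class Heuristic(NamedTuple):
    action: str
    clause_types: FrozenSet[str]


HEURISTICS: List[Heuristic] = [
    Heuristic("Introduce escalation branches", frozenset({"Exception"})),
    Heuristic("Introduce sub-processes", frozenset({"Nested", "Periodic"})),
    Heuristic("Introduce quality-checking loops", frozenset({"Quality"})),
    Heuristic("Introduce planning activities", frozenset({"Status"})),
    Heuristic("Introduce a renegotiation phase", frozenset({"Nested", "Periodic"})),
    Heuristic("Introduce reporting and information flow phases",
              CLAUSE_TYPES - {"Exception"}),
]


def suggest(clause_types: Iterable[str]) -> List[str]:
    """Return, in table order, the heuristics applicable to the given clause types."""
    wanted = set(clause_types)
    return [h.action for h in HEURISTICS if h.clause_types & wanted]


# A periodic clause such as the monthly report (s2 in Table 1):
periodic_suggestions = suggest({"Periodic"})
```

For a periodic clause this yields sub-processes, a renegotiation phase and reporting phases. Such a lookup only proposes candidate fragments; deciding how they are combined into a recommended workflow remains a design task.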






6.3 Introduction of Additional “Accepted Practice” Sub-workflows 

Some parts of widely-used business processes are available for re-use within some of 
the well-known workflow management systems, e.g. ActionWorks Metro [9]. Typical 
examples are getting feedback from a number of people on a draft document, or 
common business applications such as invoice/payment. Such business processes 
could be considered as potential solutions for implementing certain processes 
that satisfy contract conditions. 



6.4 Cross-Checking of Business Process Models 

As discussed earlier, the parties to a contract may wish to tailor any recommended 
workflows to meet their internal organisation culture, or they may already have their 
own workflows. Another approach is to analyse the difference between the process 
models of the individual parties and the “recommended” model. As we found from 
our own experience, even deriving a recommended model can introduce the possibil- 
ity of inconsistency with the BCL model, so cross checking is also needed here. In our 
own example, we can highlight the fact that there is no explicit measuring of the call- 
out time in the process model. We may allow this to be included by the subcontractor 
in “Perform Maintenance/Repair”, or we may feel that this does not encourage the 
call-out time to be reliably captured. 
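
One simple form of such a cross-check is to compare the data items each monitoring policy needs with the data items the process activities are known to capture. The Python sketch below illustrates the idea; the policy name `callOutTimeLimit`, the activity "Report Problem" and the data item names are assumptions made for the example, not elements of the actual models.

```python
from typing import Dict, Set


def uncaptured_inputs(
    policy_inputs: Dict[str, Set[str]],
    activity_outputs: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """For each policy, list the required data items that no activity captures."""
    captured: Set[str] = set().union(*activity_outputs.values())
    gaps: Dict[str, Set[str]] = {}
    for policy, needed in policy_inputs.items():
        missing = needed - captured
        if missing:
            gaps[policy] = missing
    return gaps


# Hypothetical inputs of two monitoring policies.
policy_inputs = {
    "downtimeLimit": {"TimeOfFailure", "FinishDateTime"},
    "callOutTimeLimit": {"TimeOfFailure", "ArrivalOnSiteTime"},
}

# Hypothetical data captured by activities of the process model.
activity_outputs = {
    "Report Problem": {"TimeOfFailure"},
    "Perform Maintenance/Repair": {"StartDateTime", "FinishDateTime"},
}

gaps = uncaptured_inputs(policy_inputs, activity_outputs)
```

Here the check reports that the call-out policy depends on an arrival time that no activity records, which mirrors the gap noted above for the call-out time. It says nothing about whether captured data are reliable, which is the separate concern of Section 5.3.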



7 Related Work 

Very few researchers have addressed the relationship between contracts and workflow. 
In the paper of van den Heuvel and Weigand [11] and in the European CrossFlow 
project [12], contracts are introduced as a means of coordinating, at a higher 
level, the workflows of individual organisations in a B2B environment. In the commercial 
field, Dralasoft, a vendor of component workflow software, has recently 
(23/02/04) announced a link with diCarta [13], but further details are not yet known. 
We believe our approach is currently unique in trying to re-use contract information 
to infer the workflows that should exist within and between the parties. 



8 Future Work and Conclusions 

To further this work, a natural follow-on is to analyse a larger number of contracts to 
examine whether there are other clause/contract characteristics, and to come up 
with a more comprehensive classification of contracts. This would also help identify 
possible further patterns which would suggest heuristics for deriving business processes 
from natural language expression of contracts. Until this is done, we believe that 
it is premature to develop software approaches for this derivation, such as intelligent 
agents which could be used for building knowledge bases containing suitable derivation 
heuristics. Further, the development of tools for cross checking between the BCL 
and process models is also dependent on a greater understanding of the variety of contract 
clause types. Another open question is to what extent derived workflows can be fed 
back into the contract negotiation. 

It is worth noting that our original hypothesis was that it may be possible to trans- 
late a business contract expressed in a language such as BCL into a business process 
language that could be used in a workflow management system, but this did not prove 
to be realistic. This research found that the types of contract, and the nature of the 
politics between and within the parties to a contract, were too variable. We have proposed 
a set of heuristics that can help guide the design of recommended workflows 
that could guide parties to implement contract-compliant behaviour. 



Acknowledgements 

The work reported in this paper has been funded in part by the Co-operative Research 
Centre for Enterprise Distributed Systems Technology (DSTC) through the Australian 
Federal Government's CRC Programme (Department of Industry, Science & Re- 
sources). 

The authors would also like to thank Dr Michael Lawley and Tony O’Hagan for 
their comments on an earlier version of this paper. 



References 

[1] Z. Milosevic, G. Dromey, On Expressing and Monitoring Behaviour in Contracts, 
EDOC2002 Conference, Lausanne, Switzerland 

[2] iMany, http://www.imany.com 

[3] DiCarta, http://www.dicarta.com 

[4] UpsideContracts, http://www.upsidecontract.com 

[5] D. Luckham, The Power of Events, Addison-Wesley, 2002 

[6] Oracle Contracts, http://www.oracle.com/appsnet/products/contracts/content.html 

[7] P. Linington, Z. Milosevic, J. Cole, S. Gibson, S. Kulkarni, S. Neal, A unified behavioural 
model and a contract for extended enterprise. Data Knowledge and Engineering 
Journal, Elsevier Science, to appear. 

[8] S. Neal, J. Cole, P.F. Linington, Z. Milosevic, S. Gibson, S. Kulkarni, Identifying requirements 
for Business Contract Language: A Monitoring Perspective, IEEE 
EDOC2003 Conference Proceedings, Sep 03. 

[9] Action Technologies, Inc, http://www.actiontech.com 

[10] http://www-106.ibm.com/developerworks/library/ws-bpel/ 

[11] van den Heuvel, W-J and Weigand, H "Cross-Organizational Workflow Integration using 
Contracts" http://jeffsutherland.org/oopsla2000/vandenheuvel/vandenheuvel.htm 

[12] Damen, Z, Derks, W, Duitshof, M and Ensing, H "Business-to-Business E-commerce in a 
Logistics Domain" http://www.crossflow.org/ link to Publications 

[13] Dralasoft Inc Press Release, http://www.dralasoft.com/news/dicarta.html 



An Ontology-Driven Process Modeling 
Framework 



Gianluigi Greco 1 , Antonella Guzzo 1 , Luigi Pontieri 2 , and Domenico Sacca 1,2 



1 DEIS, University of Calabria, Via Pietro Bucci 41C, 87036 Rende, Italy 
2 ICAR, CNR, Via Pietro Bucci 41C, 87036 Rende, Italy 
{ggreco,guzzo}@si.deis.unical.it, {pontieri,sacca}@icar.cnr.it 



Abstract. Designing, analyzing and managing complex processes have 
recently become crucial issues in most application contexts, such as 
e-commerce, business process (re-)engineering and Web/grid computing. In 
this paper, we propose a framework that supports the designer in the 
definition and in the analysis of complex processes by means of several 
facilities for reusing, customizing and generalizing existing process 
components. To this aim we tightly integrate process models with a domain 
ontology and an activity ontology, so providing a semantic vision of 
the application context and of the processes themselves. Moreover, the 
framework is equipped with a set of techniques providing advanced 
functionalities, which can be very useful when building and analyzing 
process models, such as consistency checking, interactive ontology 
navigation and automatic (re)discovering of process models. A software 
architecture fully supporting our framework is also presented and discussed. 

Keywords: Process Modeling, Mining, Inheritance, Workflows, Ontologies. 



1 Introduction 

Process modeling has been addressed for decades and a lot of frameworks were 
proposed in several research fields, like Workflow Systems and Business Pro- 
cesses. This topic is a subject of interest in novel and attractive areas (e.g., 
Web/grid computing, e-commerce and e-business [4,6]), where customization 
and reuse issues play a crucial role. 

In this paper we devise a framework which supports designers in the definition, 
analysis and re-engineering of process models in complex and dynamic contexts. 
The main goal of our approach is to fully exploit the experience gained by 
designers over time, and somehow encoded in the process models defined so far. 
In order to support reuse, customization and semantic consolidation, process 
models are integrated into an ontological framework, which encompasses the 
description of the entities involved in the processes (e.g. activities and associated 
input/output parameters). Moreover, in order to ease the exploitation 
of design knowledge, we use specialization/inheritance relationships to organize 
process models into taxonomies, which can significantly reduce the effort of reusing 
and customizing a model. 



F. Galindo et al. (Eds.): DEXA 2004, LNCS 3180, pp. 13-23, 2004. 
© Springer-Verlag Berlin Heidelberg 2004 






The exploitation of ontologies for describing process models and reasoning on 
them is not new. For example, domain and task ontologies are extensively used 
by Semantic Web Services approaches (see, e.g., [2]), which are mainly devoted 
to automatic execution issues, rather than to exploiting design knowledge. Con- 
versely, Business Engineering approaches (see, e.g., [10]) focus on structuring 
design knowledge through process taxonomies, but typically give little attention 
to the specification of the execution flows. Execution flows can be effectively ex- 
pressed, through, e.g., one of the formalisms adopted in Workflow Management 
Systems (WFMS). Interestingly, inheritance of workflow models was investigated 
in [11], but principally with respect to adaptiveness and dynamic change 
issues, involving, e.g., the migration of executions produced by several variants 
of a given workflow. As a consequence, the approach relies on a formal notion of 
inheritance, focused on behavioral features and specifically defined for workflow 
models represented in terms of Petri nets. 

By contrast, as we are mainly interested in structuring and exploiting design
knowledge, in our approach the definition of specialization between process models
is not necessarily bound to a rigid notion of behavioral inheritance, thus leaving
the designer more freedom in choosing the meaning of the concepts and relationships
in the knowledge base. In a sense, our framework tries to take advantage
of ideas from all the above-mentioned process modeling perspectives, in order to
provide complete and effective support to designers.

The formal framework for process modeling has been implemented in a pro- 
totype system that can assist the user in both design and analysis tasks, by 
providing a rich and integrated set of modeling, querying and reasoning facili- 
ties. In the paper, we discuss the system architecture and focus on some of its 
advanced functionalities, such as consistency checking and interactive ontology 
navigation. We devote particular emphasis to the description of a module for
the automatic (re)discovery of process models. In fact, some process mining
techniques [1,5] were recently introduced to derive a model for a given process
based on its execution logs. Here, we extend such approaches to extract hierarchical
process models, to be profitably integrated into our ontological framework.

2 Process Modeling Framework 

In this section, we present a modeling framework, where process models can be 
specified in terms of ontology concepts, and can be related to one another,
to facilitate the reuse and consolidation of design knowledge. 



2.1 Process Schemata 

The basis for a semantic view of a process model is the ontological description 
of the activities and domain concepts which it involves. 

Let $A$ be a set of activities. An activity ontology $O_A$ for $A$ is a tuple
$\langle ISA, PARTOF \rangle$ such that $ISA \subseteq A \times A$ and $PARTOF \subseteq 2^A \times A$, where $2^A$
denotes the set of all subsets of activities, such that for each $a \in A$, there exists



An Ontology-Driven Process Modeling Framework 






no $A' \in 2^A$ such that $a \in A'$ and $A'$ PARTOF $a$. Roughly speaking, the relation
$a$ ISA $b$, for two activities $a$ and $b$, indicates that $a$ is a refinement of $b$, while
$A'$ PARTOF $a$, for $A' \subseteq A$, specifies that $a$ consists in the execution of all the “finer”
activities in $A'$. Hence, we say that $a \in A$ is a complex activity if there exists
$A' \subseteq A$ such that $A'$ PARTOF $a$; otherwise, $a$ is said to be simple.

Practically, the relation PARTOF describes how a process can be broken down 
(or “decomposed”) into sub-activities. Moreover, the relation ISA allows the 
designer to specialize a given activity. Some major issues related to the special- 
ization of complex activities are discussed in the next subsection. 
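
As a rough illustration of the two relations, the following Python sketch stores an activity ontology as sets of ISA and PARTOF pairs and rejects any decomposition in which an activity would be part of itself. The class name `ActivityOntology`, its methods and the activity labels are hypothetical choices made for this sketch; they are not taken from the system described in the paper.

```python
from dataclasses import dataclass, field


@dataclass
class ActivityOntology:
    """Toy activity ontology (hypothetical names, for illustration only)."""
    activities: set
    isa: set = field(default_factory=set)       # pairs (a, b): a refines b
    part_of: set = field(default_factory=set)   # pairs (frozenset(parts), whole)

    def add_isa(self, a, b):
        self._require_known({a, b})
        self.isa.add((a, b))

    def add_part_of(self, parts, whole):
        parts = frozenset(parts)
        self._require_known(parts | {whole})
        # Constraint from the definition: no A' with a in A' and A' PARTOF a.
        if whole in parts:
            raise ValueError(f"{whole!r} cannot be part of itself")
        self.part_of.add((parts, whole))

    def is_complex(self, a):
        # An activity is complex if some set of activities is PARTOF it.
        return any(whole == a for _, whole in self.part_of)

    def _require_known(self, names):
        unknown = set(names) - self.activities
        if unknown:
            raise ValueError(f"unknown activities: {sorted(unknown)}")


# Hypothetical labels loosely inspired by an order-handling scenario.
onto = ActivityOntology({"handle_order", "receive_order", "authenticate_client",
                         "credit_card_authentication", "ship_product"})
onto.add_part_of({"receive_order", "authenticate_client", "ship_product"},
                 "handle_order")
onto.add_isa("credit_card_authentication", "authenticate_client")
```

Storing PARTOF as pairs (set of parts, whole) allows several alternative decompositions of the same activity, which the definition does not rule out.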

Example. In order to clarify our approach, we shall use the following
example throughout the paper. Assume that a process model has to be designed
to handle customers’ orders in a company. The first step is to define the activities
that must be carried out in the business cases. To this aim, the ontology $O_A$
includes an order-handling activity, which, in turn, consists of the following
ones: (a) receive an order, (b) authenticate the client, (c) check the product
availability, (d) ship the product, (e) send an invoice. □

Let $D$ be the domain of our application, and let $O_D$ be a domain ontology.
The interface of an activity $a$ in $D$ is a pair $I_a = \langle InPort_a, OutPort_a \rangle$ of sets of
concepts in $D$, where $OutPort_a$ specifies the result of the enactment of $a$, while
$InPort_a$ specifies what is required for enabling $a$.

In general, the input concepts required by a sub-activity either are produced 
by other activities in the process or are (external) inputs of the process itself. 
Similarly, the outputs of an activity can be delivered within or outside of the 
process. A more detailed description of the structure of a complex activity, including
the input/output dependencies between the involved sub-activities, can
be obtained through the following notion of process schema.

Definition 1 (Process Schema). Let $O_A$ be an activity ontology, $O_D$ be a
domain ontology, and $a$ be an activity in $A$. A process schema $PS_a$ for $a$ is a
tuple $\langle I, T, a_0, F, CF, IN, OUT_{min}, OUT_{max} \rangle$ such that:

– $I$ is the interface of $a$ (i.e., $I = I_a = \langle InPort_a, OutPort_a \rangle$);

– $T$ is a set of activities s.t. $T$ PARTOF $a$ is asserted in $O_A$;

– $a_0 \in A$ is the starting activity and $F \subseteq A$ is the set of final activities;

– $CF$, referred to as the control flow of $PS_a$, is a relation of precedences
among activities s.t. $CF \subseteq (A \setminus F) \times (A \setminus \{a_0\})$ and $E \subseteq CF^+$ is s.t.
   • $(x, y) \in E$ implies that $InPort_y \cap OutPort_x \neq \emptyset$,
   • for each $y \in T$ and for each $c \in InPort_y$, either (i) $c \in InPort_a$ or (ii)
     there exists $(z, y) \in E$ s.t. $c \in OutPort_z$, and
   • for each $c \in OutPort_a$, there exists $x \in T$ s.t. $c \in OutPort_x$;

– $IN$, $OUT_{min}$, and $OUT_{max}$ are three functions assigning to each activity in
$A_a$ a natural number such that (i) $IN(a_0) = 0$, (ii) $\forall a \in F$, $OUT_{min}(a) = OUT_{max}(a) = 0$,
and (iii) $\forall x \in A_a$, $0 < IN(x) \leq InDegree(x)$ and
$0 < OUT_{min}(x) \leq OUT_{max}(x) \leq OutDegree(x)$,

where $CF^+$ denotes the transitive closure of $CF$, $InDegree(x)$ is $|\{e = (y, x) \mid e \in CF\}|$
and $OutDegree(x)$ is $|\{e = (x, z) \mid e \in CF\}|$.
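
The structural part of this definition can be sketched as a small validation routine. The Python code below is only an illustration under stated assumptions: the function names, the dictionary-based encoding and the sample control flow are hypothetical (the actual schema of Figure 1 is not reproduced here), the IN bound is checked for every activity except the starting one, the OUT bounds for every non-final activity, and the data-flow conditions on ports are not checked.

```python
def in_degree(cf, x):
    """Number of arcs (y, x) entering x."""
    return sum(1 for (_, target) in cf if target == x)


def out_degree(cf, x):
    """Number of arcs (x, z) leaving x."""
    return sum(1 for (source, _) in cf if source == x)


def check_schema(activities, start, finals, cf, IN, OUT_min, OUT_max):
    """Return the list of violated constraints (empty when none is found)."""
    errors = []
    for (x, y) in cf:
        if x not in activities or y not in activities:
            errors.append(f"arc ({x}, {y}) uses an unknown activity")
        if x in finals:
            errors.append(f"arc ({x}, {y}) leaves a final activity")
        if y == start:
            errors.append(f"arc ({x}, {y}) enters the starting activity")
    if IN[start] != 0:
        errors.append("IN of the starting activity must be 0")
    for a in sorted(activities):
        if a in finals:
            if OUT_min[a] != 0 or OUT_max[a] != 0:
                errors.append(f"OUT_min and OUT_max of final activity {a} must be 0")
        elif not 0 < OUT_min[a] <= OUT_max[a] <= out_degree(cf, a):
            errors.append(f"OUT bounds of {a} are inconsistent")
        if a != start and not 0 < IN[a] <= in_degree(cf, a):
            errors.append(f"IN({a}) is out of range")
    return errors


# Hypothetical control flow for an order-handling process.
acts = {"receive", "authenticate", "check", "ship", "invoice"}
cf = {("receive", "authenticate"), ("receive", "check"),
      ("authenticate", "ship"), ("check", "ship"), ("ship", "invoice")}
IN = {"receive": 0, "authenticate": 1, "check": 1, "ship": 2, "invoice": 1}
OUT_min = {"receive": 2, "authenticate": 1, "check": 1, "ship": 1, "invoice": 0}
OUT_max = dict(OUT_min)
problems = check_schema(acts, "receive", {"invoice"}, cf, IN, OUT_min, OUT_max)
```

Returning a list of messages rather than raising on the first violation makes it easier to report all problems of a draft schema at once.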






Intuitively, for any activity having a significant level of complexity, a process
schema allows us to define the involved sub-activities, with their mutual information
flow. For instance, the process schema for the order-handling activity of our
running example is shown in Figure 1, which also identifies its starting activity
and a final one. The values for $IN$ and $OUT_{min}$ are also reported, while any $OUT_{max}$
value is assumed to coincide with the out-degree of the associated activity.

The informal semantics of a process schema is as follows. An activity $a$ can
start as soon as at least $IN(a)$ of its predecessor activities are completed. Two
typical cases are: (i) if $IN(a) = InDegree(a)$ then $a$ is an and-join activity,
for it can be executed only after all of its predecessors are completed, and
(ii) if $IN(a) = 1$ then $a$ is called an or-join activity, for it can be executed as soon as
one predecessor is completed. As commonly assumed in the literature, we consider
only and-join and or-join activities: indeed, by means of these two elementary
types of nodes, it is possible to simulate the behavior of any activity
$a$ such that $1 < IN(a) < InDegree(a)$. Once finished, an activity $a$ activates
any non-empty subset of its outgoing arcs with cardinality between $OUT_{min}(a)$
and $OUT_{max}(a)$. If $OUT_{max}(a) = OutDegree(a)$ then $a$ is a full fork, and if also
$OUT_{min}(a) = OUT_{max}(a)$ then $a$ is a deterministic fork, as it activates all of its
successors. Finally, if $OUT_{max}(a) = 1$ then $a$ is an exclusive fork (also called
XOR-fork), for it activates exactly one of its outgoing arcs.
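
The case analysis above can be written down as two small functions that map the numeric constraints of an activity to descriptive labels. This is a sketch only: the function names are hypothetical, and the labels (and-join, or-join, full, deterministic and exclusive fork) are common workflow terms that we assume correspond to the cases described in the text. Labels are returned as a list because the cases can overlap, e.g. when an activity has a single predecessor.

```python
def join_labels(in_value, in_degree):
    """Labels for the join behavior of an activity with the given IN and in-degree."""
    labels = []
    if in_degree > 0 and in_value == in_degree:
        labels.append("and-join")      # waits for all predecessors
    if in_value == 1:
        labels.append("or-join")       # starts after one predecessor
    return labels


def fork_labels(out_min, out_max, out_degree):
    """Labels for the fork behavior given OUT_min, OUT_max and out-degree."""
    labels = []
    if out_degree > 0 and out_max == out_degree:
        labels.append("full fork")                  # may activate every outgoing arc
        if out_min == out_max:
            labels.append("deterministic fork")     # always activates all of them
    if out_max == 1:
        labels.append("exclusive fork")             # activates exactly one arc
    return labels
```

For instance, `join_labels(1, 2)` describes an activity with two predecessors that may start after either of them, while `fork_labels(2, 2, 2)` describes an activity that always activates both of its successors.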



2.2 Process Schema Inheritance 

Specialization/inheritance relationships are a means of structuring process
knowledge into different abstraction levels. Indeed, they allow for organizing
a set of related process schemata into a taxonomy, i.e. an acyclic graph where
each node corresponds to a concept more general than those associated with
its children. Such a structure can help in effectively exploiting the
design knowledge encoded in the involved process models. A key point here is
what specialization means for process schemata. Diverse notions of
specialization were defined in several contexts, e.g., OO-Design/Programming
[9,4,16], Enterprise Modeling [10,14], and Workflow Modeling [12]. The question
is particularly intriguing if one looks at the behavioral features expressed by a
process schema, representing a finite set of legal executions.

A behavioral notion of inheritance, presented in [14], w.r.t. dataflow models, 
states that all the execution instances of a schema must also be instances of 




Fig. 1. Process schema for the order-handling activity of the running example






any schema generalizing it. A different meaning of inheritance is adopted in [3],
where two basic notions are defined w.r.t. a special kind of workflow models (a
class of Petri Nets, called WF-nets). In particular, [3] states that the external
behaviors exhibited by a schema and by any of its specializations must be
indistinguishable whenever: (i) only common activities are performed (protocol
inheritance, a sort of “invocation consistency” [16]), or (ii) one abstracts from
activities which are not in the original schema (projection inheritance, a sort of
“observation consistency” [16]).

We believe that each of these notions can be more or less suitable depending on the
application context, and that none of them is best in general. Therefore, we
prefer to leave the designer free to specialize a process model in different ways.
In general, a new model can be derived from one or more existing models, by
specializing functional and/or behavioral features (e.g., input, output, activities,
dependencies and constraints on activities).

Let $PS$ be a schema. A schema $PS'$ is said to be a specialization of $PS$ if it is
obtained by one of the following operations:

– Specializing an activity in the original schema. An activity $A$ in $PS$ is replaced
with an activity $A'$ representing a specialization of $A$ in the ontology
$O_A$. Note that the inverse derivation is not allowed, that is, no activity of
$PS$ can be a specialization of some activity in $PS'$.

– Deleting an activity $A$ of $PS$. Removing an activity corresponds to excluding
any process execution involving it, or, equivalently, to adding further constraints
to the process schema. Obviously, deletions are legal only if both the initial
activity and at least one of the final activities are left.

– Adding an activity $A$ to $PS$.

– Modifying the control flow expressed by $PS$, by either removing links
(and/or weakening some constraints), or adding further links between the
activities (and/or constraints over them).

– Specializing the interface of the complex activity modeled by $PS$.
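
To make the legality condition on deletions concrete, the sketch below removes an activity from a schema and refuses to do so when the initial activity, or the last remaining final activity, would disappear. The function `delete_activity`, the set-based encoding and the activity labels are assumptions made for this illustration; re-linking activities left dangling by the removal is deliberately left to the designer.

```python
def delete_activity(activities, start, finals, cf, victim):
    """Return (activities, finals, cf) of the schema obtained by deleting victim."""
    if victim not in activities:
        raise ValueError(f"unknown activity: {victim}")
    if victim == start:
        raise ValueError("the initial activity cannot be deleted")
    remaining_finals = set(finals) - {victim}
    if not remaining_finals:
        raise ValueError("at least one final activity must be left")
    # Drop every precedence arc that touches the deleted activity.
    remaining_cf = {(x, y) for (x, y) in cf if victim not in (x, y)}
    return set(activities) - {victim}, remaining_finals, remaining_cf


# Hypothetical schema with one starting activity and two final ones.
acts = {"a0", "b", "f1", "f2"}
cf = {("a0", "b"), ("b", "f1"), ("b", "f2")}
new_acts, new_finals, new_cf = delete_activity(acts, "a0", {"f1", "f2"}, cf, "f2")
```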

Note that, as adding an activity to $PS$ corresponds to deleting an activity
from $PS'$, we could rather consider $PS$ as a specialization of $PS'$, thus apparently
getting a contradiction. But the contradiction is only apparent, as two
opposite abstractions cannot be asserted at the same time: the designer makes
a choice between the two alternatives. Moreover, we observe that some of the
above operations may lead to specializations which are “unsafe” w.r.t. some of
the inheritance notions discussed above. For example, adding an activity is “unsafe”
w.r.t. the inheritance notion in [14], as it admits executions which were
not captured by the original schema. Such an inconsistency, however, could be
temporarily allowed as an exception in a hierarchy of concepts (process models)
which is unable to suitably fit the new concept (derived process model). Later
on, the hierarchy should be restructured in such a way that there is no
need to include exceptions anymore. To this aim, our system is equipped with
facilities for recognizing and recovering from inconsistencies, w.r.t. the chosen notion
of inheritance, while a taxonomy of process models is being built.
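
One way to picture the "unsafe" case is to compare the executions admitted by two schemata directly. The sketch below assumes, purely for illustration, that each schema is given extensionally as a finite collection of traces; the function name and the activity labels are hypothetical, and the check shown is plain trace inclusion rather than the consistency facilities of the system.

```python
def unsafe_executions(specialized_traces, general_traces):
    """Executions of the specialization that the general schema does not admit."""
    return set(map(tuple, specialized_traces)) - set(map(tuple, general_traces))


general = [("receive", "ship", "invoice"), ("receive", "invoice")]
by_deletion = [("receive", "invoice")]                        # "ship" removed
by_addition = [("receive", "payment_terms", "ship", "invoice")]  # activity added

safe = unsafe_executions(by_deletion, general)
unsafe = unsafe_executions(by_addition, general)
```

Here the schema obtained by deletion yields no offending execution, whereas the one obtained by addition does, which matches the observation that adding an activity admits executions not captured by the original schema.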






Different examples of specialization for the sample order-handling process
are depicted in Figure 2: a first specialized process is obtained
by deleting the “ship product” activity; a second one is
obtained by adding the “insert term of payment” activity to the more general
process; and finally, in a third one, the “client authentication” activity is replaced
with a more specific one (“credit card authentication”).




Fig. 2. An example of “temporary” specialization hierarchy 



3 System Architecture 

This section illustrates the main features of a software system (implemented
in Java), supporting the design, analysis and usage of process models. From
a conceptual point of view, the system architecture, sketched on the right side
of Figure 3, is centered upon a rich knowledge base, which stores a semantic
description of the processes, according to the framework presented above. Moreover,
a set of modeling, querying and reasoning tools is provided, which make it
possible to build and extend this knowledge base, as well as to exploit it in several tasks
of a process model’s life cycle, such as: (i) defining or re-engineering a process
model and its components, (ii) specializing or generalizing existing models, (iii)
checking the workflow schema of a process and (iv) analyzing its behavior. The
main modules in the architecture are the following:

The XML repository represents the core of the system knowledge base. It is 
a native XML database managing the representation of both process schemata 
and execution instances, encoded in an XML-based format. Notably, all the 







[Figure 3 diagram labels: Enactment/Ontology Engines, Ontology Import/Export, Process Miner (Restructuring, Clustering), Consistency Checker, Simulator]

Fig. 3. System Architecture (right) and a screen-shot of the user interface (left)



semantic relationships involving schemata, activities and other domain entities 
are explicitly stored in the repository. 

The Ontology I/O module offers mechanisms for connecting to, browsing
and importing parts of an external ontology, provided that it is expressed in
the Web Ontology Language (OWL) [15], a semantic markup language by the
World Wide Web Consortium. In addition, the module makes it possible to expose the
contents of the knowledge base outside the system as an ontology, still
adopting the standard OWL format.

The WF I/O module provides the ability to translate a given process
schema into an executable specification to be enacted by a suitable engine. In
the current implementation of the system, Business Process Execution Language
(BPEL) [8] has been chosen as such a specification language, mainly because
this XML-based language represents a widely accepted notation for describing
processes, fully integrated with the Web Services technology, while run-time
environments supporting it are becoming available.

The Consistency Checker is in an early stage of development, and is
intended to provide a number of facilities for analyzing the defined process
schemata. Currently, the module allows the user to assess the syntactic and
semantic correctness of a designed process model, by providing automatic support
for consistency checking and schema validation regarding both the
static features of a model and its dynamic behavior. In addition, we intend to
further support the analysis of process behaviors by developing a Simulation
engine that simulates the execution of a given process in various situations. Some
interesting applications of such a tool might be the investigation of the process
model by means of “what if” scenarios and the comparison of alternative design
choices. Details on the techniques we plan to exploit in the development of such
an engine can be found in a previous work [7].

The User Interface, a screen-shot of which is shown on the left side of Figure 3,
enables the system to be used in an easy and effective way. Notably, the
whole content of the knowledge base can be accessed by users through a general-purpose
query engine associated with the XML repository. Moreover, the exploration
of such data is made easier by exploiting the taxonomical structures in
which the various kinds of concepts are organized, according to the specialization







and partonomy relationships which relate them (see the tree-like structure
on the left side of the screen-shot).

The Process Miner module is mainly devoted to enabling the automatic
derivation of process models, based on induction techniques. Therefore, it can
be of great value to the design of process models, especially when complex and
heterogeneous behaviors are to be modeled. It is composed of two separate components,
i.e., the Restructuring and Clustering modules, whose functionalities will
be described in the next section, since this is a key module paving the way for
an effective usage of the whole approach.



3.1 Building and Using a Process Model Knowledge Base 

This section describes the core techniques implemented in the Process Miner
module, which can be profitably used in the re-design and analysis of process
models. Notably, these tools can be very useful when modeling processes with 
complex and unexpected dynamics, which would require expensive and long 
analysis for a complete design. To this aim, a sample of executions is exploited 
to build a hierarchy of process schemata conforming to our framework, which 
model the behaviors of the underlying process at different refinement levels. 

In order to better explain our approach, we first introduce some preliminary
definitions and notation. Let $A_P$ be the set of identifiers denoting the activities
involved in a given process $P$. A process trace $s$ over $A_P$ is a string in $A_P^*$,
representing a sequence of activities, while a process log for $P$, denoted by $\mathcal{L}_P$,
is a bag of traces over $A_P$. Then, a set of traces produced by past enactments of
the process $P$ is examined to induce a hierarchy of process schemata representing
the behavior of $P$ at different levels of refinement.
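
Since a log is a bag rather than a set, repeated executions must be counted. A minimal way to represent this, assuming activities are identified by plain strings and using the hypothetical helper `make_log`, is a multiset of tuples:

```python
from collections import Counter


def make_log(traces):
    """Build a log as a bag (multiset) of traces; each trace becomes a tuple."""
    return Counter(tuple(trace) for trace in traces)


# Three past enactments, two of which followed the same sequence of activities.
log = make_log([
    ["receive", "authenticate", "ship"],
    ["receive", "ship", "authenticate"],
    ["receive", "authenticate", "ship"],
])
```

Tuples are used because they are hashable, so identical traces collapse into one entry whose count records how often that execution was observed.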

The algorithm ProcessDiscover, shown in Figure 4, starts with a preliminary
model $WS_0^1$, which only accounts for the dependencies among the activities in
$P$. Then it refines the current schema in an iterative and incremental way, by
exploiting a suitable set of features, which make it possible to distinguish different
behavioral patterns. The result of the algorithm is a taxonomy of schemata, which we
actually represent as a set of schemata, where each schema $WS_i^j$ is identified by its
level $i$ in the hierarchy (i.e., the number $i$ of refinements required to produce the
schema) and by the position $j$ where it occurs inside that level.

The schema $WS_0^1$ is computed by mining a control flow $CF_\sigma$, according to a
minimum support threshold $\sigma$, through a procedure mainly
exploiting techniques already presented in the literature (see, e.g., [1,13]). $WS_0^1$
is then inserted in $T$, and the algorithm starts partitioning it. After the initialization
described above, the algorithm performs two separate phases.
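
To give a feeling for what mining a control flow under a minimum support threshold may look like, the sketch below keeps a precedence (a, b) whenever b directly follows a in at least a fraction sigma of the logged traces. This is a deliberately simplified stand-in, not the procedure used in the paper: the techniques of [1,13] are more elaborate, and the function name, the direct-succession criterion and the sample log are assumptions of this example.

```python
from collections import Counter


def mine_precedences(log, sigma):
    """Return the pairs (a, b) whose direct succession has support >= sigma.

    log maps each trace (a tuple of activity identifiers) to its number of
    occurrences; sigma is a fraction between 0 and 1.
    """
    total = sum(log.values())
    support = Counter()
    for trace, count in log.items():
        # Count each pair at most once per trace occurrence.
        for pair in set(zip(trace, trace[1:])):
            support[pair] += count
    return {pair for pair, hits in support.items() if hits / total >= sigma}


log = Counter({("a", "b", "c"): 3, ("a", "c"): 1})
frequent = mine_precedences(log, 0.5)
all_pairs = mine_precedences(log, 0.25)
```

Raising sigma filters out rarely observed successions, such as the pair ("a", "c") in the sample log, so the initial schema only reflects the dominant behavior.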

In the first phase, the taxonomy of schemata is built through a top-down refinement
of each schema, implemented by a recursive procedure. This
procedure mainly relies on identifying different patterns of executions
by means of an algorithm for clustering the process traces $\mathcal{L}(WS_i^j)$ associated
with each element $WS_i^j$ in the hierarchy. It is based on projecting these traces
onto a set of properly defined features. Thus, in order to reuse well-known cluster-



